Timing comparison of the BPE tokenizer in two libraries, Hugging Face Tokenizers and OpenAI TikToken, on the validation split of the TinyStories dataset:

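import time

import tiktoken
from datasets import load_dataset
from tokenizers import Tokenizer

# Load TinyStories and take the text column of the validation split
dataset = load_dataset("roneneldan/TinyStories")
texts = dataset["validation"]["text"]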

# Load the GPT-2 tokenizer for both libraries
tiktokenizer = tiktoken.get_encoding("gpt2") # tiktoken
hf_tokenizer = Tokenizer.from_pretrained("gpt2") # Hugging Face tokenizers

# Measure tiktoken speed
start_time = time.time()
tiktoken_results = [tiktokenizer.encode(text) for text in texts]
tiktoken_time = time.time() - start_time

# Measure tokenizers speed
start_time = time.time()
hf_results = [hf_tokenizer.encode(text).ids for text in texts]
hf_time = time.time() - start_time

# Print results
print(f"tiktoken Time: {tiktoken_time:.4f} seconds")
print(f"tokenizers Time: {hf_time:.4f} seconds")

tiktoken Time: 2.6481 seconds
tokenizers Time: 16.7744 seconds
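
The numbers above come from a single pass per library, so they can be affected by warm-up and background load. A minimal sketch of a repeatable measurement follows, assuming a best-of-N policy is acceptable. The helper names `best_time` and `speedup` and the stand-in `toy_encode` are hypothetical and not part of the original post; the toy encoder only exists so the sketch runs without downloading the dataset or the tokenizers.

```python
import time
from typing import Callable, List, Sequence


def best_time(
    encode: Callable[[str], List[int]],
    texts: Sequence[str],
    repeats: int = 3,
) -> float:
    """Return the fastest wall-clock time in seconds over `repeats` full passes."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        for text in texts:
            encode(text)
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(slow_seconds: float, fast_seconds: float) -> float:
    """How many times faster the fast run is than the slow run."""
    return slow_seconds / fast_seconds


def toy_encode(text: str) -> List[int]:
    """Hypothetical stand-in encoder: maps each word to its length."""
    return [len(word) for word in text.split()]


sample_texts = ["Once upon a time there was a little cat."] * 1000
toy_seconds = best_time(toy_encode, sample_texts)
print(f"toy encoder: {toy_seconds:.4f} seconds")

# Ratio implied by the single-run timings reported in the post.
reported_ratio = speedup(16.7744, 2.6481)
print(f"tokenizers / tiktoken: {reported_ratio:.2f}x")
```

To time the real libraries with this helper, one would pass `tiktokenizer.encode` and `lambda text: hf_tokenizer.encode(text).ids` as the `encode` argument, with `texts` from the snippet above. For the reported timings, the ratio works out to roughly 6.3x in favor of tiktoken on this run.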



tg-me.com/pytorch_howsam/671

BY PyTorch Howsam



